Batch-Mode Active Learning via Error Bound Minimization
Authors
Abstract
Active learning has proven quite effective in reducing human labeling effort by actively selecting the most informative examples to label. In this paper, we present a batch-mode active learning method based on logistic regression. Our key motivation is an out-of-sample bound on the estimation error of the class distribution in logistic regression, conditioned on any fixed training sample. It differs from a typical PAC-style passive learning error bound, which relies on the i.i.d. assumption on example-label pairs. In addition, it does not involve the class labels of the training sample. Therefore, it can be used directly to design an active learning algorithm that minimizes this bound iteratively. We also discuss the connections between the proposed method and several existing active learning approaches. Experiments on benchmark UCI datasets and text datasets demonstrate that the proposed method significantly outperforms state-of-the-art active learning methods.
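To make the iterative bound-minimization idea concrete, below is a minimal sketch of greedy batch selection for logistic regression. It is not the paper's exact bound: as an assumption, it uses an A-optimal-style surrogate (the trace of the inverse regularized Fisher information of the labeled set), which, like the paper's bound, does not depend on the unknown labels of the candidate points. The function name and the regularization scheme are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def select_batch(X_labeled, y_labeled, X_pool, batch_size, reg=1.0):
    """Greedily pick pool points whose addition most reduces a
    variance-style surrogate of the estimation error (an assumption,
    not the paper's exact bound)."""
    clf = LogisticRegression(C=1.0 / reg).fit(X_labeled, y_labeled)
    d = X_labeled.shape[1]

    # Per-example logistic weights p(1 - p) under the current model.
    p_lab = clf.predict_proba(X_labeled)[:, 1]
    w_lab = p_lab * (1.0 - p_lab)
    p_pool = clf.predict_proba(X_pool)[:, 1]
    w_pool = p_pool * (1.0 - p_pool)

    # Regularized Fisher information of the current labeled set.
    A = X_labeled.T @ (X_labeled * w_lab[:, None]) + reg * np.eye(d)

    selected = []
    for _ in range(batch_size):
        scores = []
        for i in range(X_pool.shape[0]):
            if i in selected:
                scores.append(np.inf)
                continue
            x = X_pool[i:i + 1]
            A_new = A + w_pool[i] * (x.T @ x)
            # Surrogate error bound: trace of the inverse information matrix.
            scores.append(np.trace(np.linalg.inv(A_new)))
        best = int(np.argmin(scores))
        selected.append(best)
        x = X_pool[best:best + 1]
        A = A + w_pool[best] * (x.T @ x)
    return selected
```

Because the surrogate ignores the candidates' labels, the whole batch can be chosen before any new annotation is requested, which is the defining property of batch-mode selection.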
Similar resources
Dynamic Batch Mode Active Learning via L1 Regularization
We propose a method for dynamic batch mode active learning where the batch size and selection criteria are integrated into a single formulation.
An extension of the Chernoff-based transformation matrix estimation method for on-line learning in Bayesian binary hypothesis tests
In a previous paper [8] we proposed a method to improve the classification between two classes in a new transformed space using the Chernoff similarity measure. The key idea is to estimate a transformation matrix such that the overlap between the pdfs associated with the competing classes is minimized, thus leading to a minimization of the classification error. Starting from a surrogate cost fun...
Convex Batch Mode Active Sampling via alpha-relative Pearson Divergence
Active learning is a machine learning technique that trains a classifier by selecting a subset of an unlabeled dataset for labeling and using the selected data for training. Recently, batch mode active learning, which selects a batch of samples to label in parallel, has attracted a lot of attention. Its challenge lies in the choice of criteria used for guiding the search for the optimal bat...
Active Instance Sampling via Matrix Partition
Recently, batch-mode active learning has attracted a lot of attention. In this paper, we propose a novel batch-mode active learning approach that selects a batch of queries in each iteration by maximizing a natural mutual information criterion between the labeled and unlabeled instances. By employing a Gaussian process framework, this mutual information based instance selection problem can be f...
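A common practical approximation of such mutual-information criteria under a Gaussian process is greedy posterior-variance selection. The sketch below illustrates that approximation only; it is not the matrix-partition formulation of the cited paper, and the noise parameter and function name are assumptions.

```python
import numpy as np

def greedy_gp_selection(K, candidate_idx, batch_size, noise=1e-2):
    """Greedily pick the candidate with the largest GP posterior variance
    given the points chosen so far (a simple proxy for information gain)."""
    selected = []
    for _ in range(batch_size):
        best, best_var = None, -np.inf
        for i in candidate_idx:
            if i in selected:
                continue
            if selected:
                K_ss = K[np.ix_(selected, selected)] + noise * np.eye(len(selected))
                k_is = K[i, selected]
                var = K[i, i] - k_is @ np.linalg.solve(K_ss, k_is)
            else:
                var = K[i, i]
            if var > best_var:
                best, best_var = i, var
        selected.append(best)
    return selected
```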
Counterfactual Risk Minimization
We develop a learning principle and an efficient algorithm for batch learning from logged bandit feedback. Unlike in supervised learning, where the algorithm receives training examples (x_i, y_i*) with annotated correct labels y_i*, bandit feedback merely provides a cardinal reward δ_i ∈ ℝ for the prediction y_i that the logging system made for context x_i. Such bandit feedback is ubiquitous in...
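The quantity at the core of such counterfactual approaches is an inverse-propensity-score (IPS) estimate of a new policy's expected reward from the logged data. The sketch below shows only that clipped IPS estimate; the variance regularization that gives counterfactual risk minimization its name is a separate ingredient, and the clipping threshold here is an assumed default.

```python
import numpy as np

def clipped_ips_reward(rewards, new_policy_probs, logging_probs, clip=10.0):
    """Clipped IPS estimate of a new policy's expected reward from logged
    bandit feedback (rewards delta_i logged under the logging policy)."""
    weights = np.minimum(new_policy_probs / logging_probs, clip)  # importance weights
    return float(np.mean(weights * rewards))
```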